40 research outputs found
A modulation property of time-frequency derivatives of filtered phase and its application to aperiodicity and fo estimation
We introduce a simple and linear SNR (strictly speaking, periodic to random
power ratio) estimator (0dB to 80dB without additional
calibration/linearization) for providing reliable descriptions of aperiodicity
in speech corpora. The main idea of this method is to estimate the background
random noise level without directly extracting the background noise. The
proposed method is applicable to a wide variety of time windowing functions
with very low sidelobe levels. The estimate combines the frequency derivative
and the time-frequency derivative of the mapping from filter center frequency
to the output instantaneous frequency. This procedure can replace the
periodicity detection and aperiodicity estimation subsystems of the recently
introduced open-source vocoder, the YANG vocoder. Source code of the MATLAB
implementation of this method will also be open-sourced.
Comment: 8 pages, 9 figures, submitted and accepted in Interspeech201
Real-time and interactive tools for vocal training based on an analytic signal with a cosine series envelope
We introduce real-time and interactive tools for assisting vocal training. In
this presentation, we mainly demonstrate a tool based on a real-time visualizer
of fundamental frequency candidates to provide information-rich feedback to
learners. The visualizer uses an efficient algorithm using analytic signals for
deriving phase-based attributes. We have started using these tools in vocal
training to help learners become aware of appropriate vocalization.
The first author made the MATLAB implementation of the tools open-source. The
code and associated video materials are accessible in the first author's GitHub
repository.
Comment: 4 pages, 6 figures, APSIPA ASC 201
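The idea of deriving phase-based attributes from an analytic signal can be illustrated with a minimal sketch. It does not reproduce the cosine series envelope design named in the title; it assumes a generic FFT-based analytic signal, and the names `analytic_signal` and `instantaneous_frequency` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real sequence: keep DC, double positive bins, zero negative bins."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0          # Nyquist bin is kept once
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def instantaneous_frequency(x, fs):
    """Instantaneous frequency in Hz from the phase increment between successive samples."""
    z = analytic_signal(x)
    dphi = np.angle(z[1:] * np.conj(z[:-1]))
    return dphi * fs / (2.0 * np.pi)

# Assumed test tone: 220 Hz sampled at 8 kHz for one second
fs = 8000.0
t = np.arange(8000) / fs
x = np.cos(2.0 * np.pi * 220.0 * t)
f_inst = instantaneous_frequency(x, fs)
f_median = float(np.median(f_inst))
```

For this stationary tone, `f_median` should sit at the tone frequency. A real-time tool would have to work on short frames through band-pass filters rather than on a whole recording, which this sketch does not attempt.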
Simultaneous Measurement of Multiple Acoustic Attributes Using Structured Periodic Test Signals Including Music and Other Sound Materials
We introduce a general framework for measuring acoustic properties such as
linear time-invariant (LTI) response, signal-dependent time-invariant (SDTI)
component, and random and time-varying (RTV) component simultaneously using
structured periodic test signals. The framework also enables the use of music
pieces and other sound materials as test signals by "safeguarding" them with
slight deterministic "noise." Measurements using swept-sine, MLS (Maximum Length
Sequence), and their variants are special cases of the proposed framework. We
implemented interactive and real-time measuring tools based on this framework
and made them open-source. Furthermore, we applied this framework to assess
pitch extractors objectively.
Comment: 8 pages, 17 figures, accepted for APSIPA ASC 202
An objective test tool for pitch extractors' response attributes
We propose an objective measurement method for pitch extractors' responses to
frequency-modulated signals. It enables us to evaluate different pitch
extractors with unified criteria. The method uses extended time-stretched
pulses combined by binary orthogonal sequences. It provides simultaneous
measurement results consisting of the linear and the non-linear time-invariant
responses and random and time-varying responses. We tested representative pitch
extractors using fundamental frequencies spanning 80 Hz to 400 Hz with 1/48
octave steps and produced more than 1000 modulation frequency response plots.
We found that animating these plots as a scientific visualization lets us
grasp the behavior of different pitch extractors at a glance. Such efficient
and effortless inspection is impossible when inspecting every plot individually.
The proposed measurement method with visualization leads to further improvement
of the performance of one of the extractors mentioned above. In other words, our
procedure turns that pitch extractor into the most reliable measuring
equipment, which is crucial for scientific research. We open-sourced the MATLAB code
of the proposed objective measurement method and visualization procedure.
Comment: 5 pages, 9 figures, submitted to Interspeech2022. arXiv admin note:
text overlap with arXiv:2111.0362
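The size of the fundamental-frequency grid described in the abstract above (80 Hz to 400 Hz in 1/48-octave steps) can be worked out with a short sketch. It assumes the grid starts exactly at 80 Hz and stops at the last step that does not exceed 400 Hz; the abstract does not say how the endpoints are handled, so the constant names and that convention are assumptions.

```python
import math

F_LOW = 80.0            # lower bound in Hz (from the abstract)
F_HIGH = 400.0          # upper bound in Hz (from the abstract)
STEPS_PER_OCTAVE = 48   # 1/48-octave spacing (from the abstract)

# Span of the range in octaves, then the number of grid points that fit
span_octaves = math.log2(F_HIGH / F_LOW)
n_points = math.floor(span_octaves * STEPS_PER_OCTAVE) + 1

# Logarithmically spaced fundamental frequencies
grid = [F_LOW * 2.0 ** (k / STEPS_PER_OCTAVE) for k in range(n_points)]
```

Under these assumptions the range covers about 2.32 octaves and `n_points` comes to 112 frequencies per sweep.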
Asymmetric Synthesis of α‐Amino Phosphonic Acids using Stable Imino Phosphonate as a Universal Precursor
A practical method for synthesizing chiral α-amino phosphonic acid derivatives was developed. Readily available and stable N-o-nitrophenylsulfenyl (Nps) imino phosphonate was utilized as a substrate for a highly enantioselective Friedel–Crafts-type addition of indole or pyrrole nucleophiles catalyzed by chiral phosphoric acid. The resulting adduct was easily converted to N-9-fluorenylmethyloxycarbonyl (Fmoc) amino phosphonic acid, which is useful for synthesizing peptides containing an amino phosphonic acid.
Three-dimensional structure of monoanionic methionine-enkephalin: X-ray structure of tert-butyloxycarbonyl-Tyr-Gly-Gly-(4-bromo)Phe-Met-OH
The conformation of tert-butyloxycarbonyl-Tyr-Gly-Gly-(4-bromo)Phe-Met-OH, a monoanionic derivative of Met-enkephalin, was elucidated by X-ray crystal analysis. The molecule took an extended conformation that was bent at the Phe residue. The implications of the dimer formation caused by four intermolecular hydrogen bonds were discussed in relation to the opiate receptor.
Frequency domain variant of Velvet noise and its application to acoustic measurements
We propose a new family of test signals for acoustic measurements such as
impulse response, nonlinearity, and the effects of background noise. The
proposed family compensates for difficulties in existing families: the
Swept-Sine (SS) and pseudo-random noise such as the maximum length sequence
(MLS). The
proposed family uses the frequency domain variant of the Velvet noise (FVN) as
its building block. An FVN is an impulse response of an all-pass filter and
yields the unit impulse when convolved with the time-reversed version of
itself. In this respect, FVN is a member of the time-stretched pulse (TSP)
family in the broadest sense. The high degree of freedom in designing an FVN
opens a vast
range of applications in acoustic measurement. Among other possibilities, we
introduce the following applications and their specific procedures. a) Spectrum
shaping adaptive to background noise. b) Simultaneous measurement of impulse
responses of multiple acoustic paths. c) Simultaneous measurement of linear and
nonlinear components of an acoustic path. d) Automatic procedure for time axis
alignment of the source and the receiver when they are using independent clocks
in acoustic impulse response measurement. We
implemented a reference measurement tool equipped with all these procedures.
The MATLAB source code and related materials are open-sourced and placed in a
GitHub repository.
Comment: 10 pages, 14 figures, APSIPA ASC 2019. arXiv admin note: text overlap
with arXiv:1806.0681
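The all-pass property stated in the abstract above (convolving the signal with its time-reversed copy yields a unit impulse) can be checked numerically. The sketch below does not implement the FVN design itself: it assumes a generic all-pass response with uniformly random phase on a finite DFT grid and uses circular rather than linear convolution. The length `n` and the random seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
half = n // 2

# Unit-magnitude spectrum with random phase; Hermitian symmetry keeps the response real
phase = np.zeros(n)
phase[1:half] = rng.uniform(-np.pi, np.pi, half - 1)
phase[half + 1:] = -phase[1:half][::-1]
h = np.fft.ifft(np.exp(1j * phase)).real   # impulse response of the all-pass filter

# Circular convolution of h with its time-reversed copy,
# written out as y[m] = sum_k h[k] * h[(k - m) mod n]
y = np.array([np.dot(h, np.roll(h, m)) for m in range(n)])

peak = y[0]
residual = np.max(np.abs(y[1:]))
```

Here `peak` should be 1 and `residual` should be at rounding-error level, because the magnitude spectrum is 1 in every bin. Any phase choice with unit magnitude gives the same result, which is the design freedom the abstract refers to.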
Are you laughing, smiling or crying?
Acoustic, articulatory, and perceptual analyses of spontaneous laughing, smiling, and crying speech were done in comparison with neutral speech. Listeners were asked to rate the emotional intensity and identify the emotion as happy, sad, or neutral (or other/unknown) of auditorily presented (a) phrases and (b) single words. The results show acoustic, articulatory and perceptual similarities for laughing, smiling and crying speech; smiling speech was sometimes judged as sad. Utterances rated as emotionally intense (whether laughing, smiling, or crying speech) are characterized by high F0, high F2 and low H2 (dB) (especially for happy), and tended to be produced with a raised/retracted upper lip and a lowered tongue dorsum. Possible reasons for the phonetic similarities in such divergent types of emotional expressions, e.g., laughing, smiling and crying, are discussed. Also discussed are possible reasons why the phonetic characteristics of speech intended by the speaker to be emotional differ from those perceived by listeners.
APSIPA ASC 2009: Asia-Pacific Signal and Information Processing Association, 2009 Annual Summit and Conference. 4-7 October 2009. Sapporo, Japan. Oral session: Synthesis of Various Affective Speech Based on Knowledge of Human (6 October 2009)